ABSTRACT

We constrain the densities of Earth- to Neptune-size planets around very cool
(T_e = 3660-4660 K) Kepler stars by comparing 1202 Keck/HIRES radial velocity
measurements of 150 nearby stars to a model based on Kepler candidate planet
radii and a power-law mass-radius relation. Our analysis is based on the
presumption that the planet populations around the two sets of stars are the same.
The model can reproduce the observed distribution of radial velocity variation
over a range of parameter values, but, for the expected level of Doppler systematic
error, the highest Kolmogorov-Smirnov probabilities occur for a power-law index
α ≈ 4, indicating that rocky-metal planets dominate the planet population in
this size range. A single population of gas-rich, low-density planets with α = 2 is
ruled out unless our Doppler errors are >5 m s^-1, i.e., much larger than expected
based on observations and stellar chromospheric emission. If small planets are a
mix of rocky planets (fraction γ; α = 3.85) and gas-rich planets (fraction 1 - γ;
α = 2), then γ > 0.5 unless Doppler errors are >4 m s^-1. Our comparison also
suggests that Kepler's detection efficiency relative to ideal calculations is less
than unity. One possible
source of incompleteness is target stars that are misclassified subgiants or giants,
for which the transits of small planets would be impossible to detect. Our results
are robust to systematic effects, and plausible errors in the estimated radii of
Kepler stars have only moderate impact.

Subject headings: planetary systems — astrobiology — techniques: radial velocities 
1. Introduction



The discovery of planets around other stars has placed our Solar System in context and
stimulated speculation on the frequency of habitable planets and life in the Universe. Very
cool dwarf stars (with late K and early M spectral types) are of special significance to such
investigations because the two principal detection techniques, Doppler radial velocity (RV)
and transit photometry, are more sensitive to smaller planets around smaller stars. Such
stars are also much less luminous than solar-type stars, the circumstellar habitable zone
is closer (Kasting et al. 1993), and planets within the habitable zone are therefore more
detectable (Gaidos et al. 2007). These stars test models of planet formation: for example,
core-accretion models predict fewer gas giants and more "failed" cores (Laughlin et al.
2004; Kennedy & Kenyon 2008), consistent with the lower frequency of giant planets and
higher frequency of low-mass planets compared to G stars (Johnson et al. 2007; Cumming
et al. 2008; Mayor et al. 2009). Finally, late K and early M dwarfs constitute three-quarters of
all stars in the Galaxy, and their contribution weighs heavily in any cosmic accounting of
planets or life.

Most confirmed exoplanets have been found by the Doppler technique, which can
detect planets of a few Earth masses on short-period orbits around bright late F- to early
K-type stars (Mayor et al. 2009; Howard et al. 2010). There are also Doppler searches for
planets around very cool dwarfs (Zechmeister et al. 2009; Bean et al. 2010; Apps et al.
2010; Forveille et al. 2011). The CoRoT and Kepler missions have successfully extended
the search for small planets to space using the transit technique. The Kepler spacecraft is
monitoring ~150,000 stars, including approximately 24,000 K-type stars and 3000 M-type
stars (Batalha et al. 2010), and has discovered hundreds of candidate planets with radii R_p
as small as ~0.8 R_⊕ (Borucki et al. 2011). The distribution with R_p peaks near 2 R_⊕ and at
the completeness limit of Kepler (Howard et al. 2011).

In principle, the mass M_p of a transiting planet can be uniquely determined by
Doppler observations, and mass and radius compared with theoretical relationships. The
mean densities of scores of giant planets and a handful of objects between the size of Earth
and Neptune orbiting nearby stars have been determined in this way (Gillon et al. 2007;
Charbonneau et al. 2009; Hartman et al. 2011; Winn et al. 2011; Demory et al. 2011). This
technique has also successfully confirmed candidate planets around the brightest CoRoT and
Kepler stars, including two with masses only a few times that of Earth (Batalha et al. 2011;
Hatzes et al. 2011). Comparison with a mass-radius relationship (MRR) can discriminate
between denser planets composed of silicates and metal ("super-Earths"), and less dense
planets with substantial envelopes of ices and hydrogen-helium gas ("ocean planets" or
"mini-Neptunes") (Seager et al. 2007). However, the Doppler signal expected from many
Kepler candidate planets is comparable to total instrument noise and stellar "jitter"
(2-3 m s^-1, Figure 1). RV measurements can be "phased" to the transit-determined orbit,
achieving greater sensitivity. Unfortunately, the great majority of cool Kepler stars are too
faint (K_p > 13) to achieve the required high SNR even using 10-m telescopes.

Instead, Doppler observations of a sample of nearby, brighter stars can constrain the 
masses and mean densities of planets around corresponding Kepler stars, assuming both 
samples host the same planet population. Every planet will contribute to RV variance 
and the aggregate effect in excess of instrument errors and the noise from the stellar 
atmosphere ("jitter") can be detected. Given Kepler-determined orbits and planet radii
and a hypothetical MRR, the cumulative distribution of RV variation can be predicted and 
compared to that from the nearby population. For a given distribution of observed radii, 
denser, rocky planets will generate greater RV variation, while less dense, ice- or gas-rich 
planets will produce smaller variation. 
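
As a rough illustration of this last point, the sketch below evaluates the standard Doppler semi-amplitude for masses assigned by a power law M_p = R_p^α. The function names, the 10 d period, and the 0.6 M_⊙ host mass are illustrative assumptions and are not values taken from the analysis.

```python
import math

M_JUP_IN_EARTH = 317.8   # Jupiter mass in Earth masses
K_JUP_1YR = 28.43        # m/s for a 1 M_Jup planet, P = 1 yr, 1 M_sun host

def mass_from_radius(r_earth, alpha):
    """Power-law mass-radius relation M_p = R_p**alpha (Earth units)."""
    return r_earth ** alpha

def rv_semi_amplitude(m_earth, period_days, mstar_sun, sin_i=1.0, ecc=0.0):
    """Doppler semi-amplitude in m/s, assuming M_p << M_star."""
    return (K_JUP_1YR * (m_earth * sin_i / M_JUP_IN_EARTH)
            * (period_days / 365.25) ** (-1.0 / 3.0)
            * mstar_sun ** (-2.0 / 3.0)
            / math.sqrt(1.0 - ecc ** 2))

if __name__ == "__main__":
    # Hypothetical 2 Earth-radius planet on a 10 d orbit around a 0.6 M_sun star.
    for alpha in (2.0, 3.85):
        mass = mass_from_radius(2.0, alpha)
        k = rv_semi_amplitude(mass, period_days=10.0, mstar_sun=0.6)
        print(f"alpha={alpha}: M_p={mass:.1f} M_Earth, K={k:.2f} m/s")
```

For a fixed radius the semi-amplitude scales linearly with mass, so at 2 R_⊕ the rocky case (α = 3.85) produces a signal larger than the gas-rich case (α = 2) by a factor of 2^1.85.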



This approach exploits both the orbital information from Kepler and the collective RV
signal from the entire population. As with RV follow-up of individual transiting planets,
planets are first detected by transit (Kepler), then characterized by Doppler observations.
Kepler observations would provide an exact description of an equivalent nearby population
only in the limit of an infinite sample, and thus the finite size of the candidate planet 
sample introduces uncertainty. We show that this uncertainty is not debilitating. This 
method also rests on two assumptions: (i) the planet populations of the Doppler and Kepler 
samples are statistically the same, and (ii) the MRR of small planets can be described by 
simple empirical relations. We discuss the validity of both of these assumptions. 

We carry out such a combined transit-Doppler analysis, predicting the statistical
distribution of RV variation in the M2K survey of late K and early M dwarfs (Apps et al.
2010). We use the Kepler distribution of candidate planet radii, corrected for detection
efficiency, and assume a single parametric MRR. We compare the predicted and observed 
distributions to constrain the MRR and hence the compositions of the small planets these 
stars host. 



2. Data



Doppler survey: The M2K survey has obtained 1406 RV measurements of 172 late
K and early M dwarfs, with at least 3 measurements for each star. Stars were selected
from the SUPERBLINK proper motion catalog (Lépine & Shara 2005) based on V-J color
and parallax- or proper-motion-based absolute magnitudes (Lépine & Gaidos 2011), and
confirmed by moderate-resolution spectroscopy. We excluded active stars with detectable
emission in Hα or in the 90th percentile of emission in the HK lines of Ca II, and another 6
stars with problematic template spectra. The remaining stars are not exceptionally active,
with median R'_HK = -4.70, and the vast majority have -5 < R'_HK < -4.5 (see inset in
Figure 3). For stars with B - V ≈ 1 these activity levels correspond to ages of 1-10 Gyr
(Mamajek & Hillenbrand 2008). Targets have apparent magnitudes of V = 8-12; most
have V = 9-10.

Doppler spectra are obtained with the red channel of the HIRES spectrograph on
the Keck I telescope (Vogt et al. 1994). Exposure times are adjusted to achieve SNR =
200. Absorption lines of molecular iodine are used as a rest-frame reference against which
to measure the Doppler shift of features in the stellar spectrum. The shift is determined
by minimizing the difference between the spectrum and a model combining an observed
spectrum of the star without iodine and one of iodine imposed on the featureless spectrum of
a B star (Marcy & Butler 1992; Butler et al. 1996). The error-weighted mean is subtracted
from the measurements of each star and the RMS is calculated (Table 1).
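
The last step can be sketched as follows. This is a minimal illustration, assuming inverse-variance weights for the mean and an unweighted RMS normalized by the number of measurements; the text does not specify either choice, and the function name is hypothetical.

```python
import math

def rv_rms(velocities, errors):
    """RMS of RVs about their error-weighted mean.

    Assumptions: weights are 1/error**2 and the RMS divides by N (not N - 1).
    """
    weights = [1.0 / e ** 2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, velocities)) / sum(weights)
    residuals = [v - mean for v in velocities]
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))

if __name__ == "__main__":
    # Made-up velocities and formal errors (m/s) for a single star.
    print(rv_rms([2.1, -3.4, 0.8, 1.9], [1.2, 1.4, 1.3, 1.1]))
```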

The effective temperature T_e of each star is estimated from the V-K color and an
empirical relation

    \log T_e \approx 3.9653 - 0.164\,(V - K) + 0.0168\,(V - K)^2,    (1)

which has an accuracy of 1% (Benedetto 1998). We estimate stellar mass M_* using an
empirical relation log(M_*/M_⊙) = 1.5 log(T_e/5780) + 0.02 based on a Yale-Yonsei 5 Gyr
isochrone (Demarque et al. 2004). The metallicities of 95 stars have been estimated using
the Spectroscopy Made Easy code (Valenti & Piskunov 1996). The standard deviation of
[Fe/H] in our sample is ±0.21 dex, and the concomitant error in stellar mass due to the use
of a solar-metallicity isochrone is ~0.02 M_⊙, which we ignore.
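
The two empirical relations above translate directly into code. The sketch below is only a transcription of them, with hypothetical function names; the example color V - K = 3.0 is made up.

```python
import math

def teff_from_vk(v_minus_k):
    """Effective temperature (K) from V-K color, Equation (1)."""
    log_t = 3.9653 - 0.164 * v_minus_k + 0.0168 * v_minus_k ** 2
    return 10.0 ** log_t

def mass_from_teff(teff):
    """Stellar mass (solar units) from log(M/Msun) = 1.5 log(Te/5780) + 0.02."""
    return 10.0 ** (1.5 * math.log10(teff / 5780.0) + 0.02)

if __name__ == "__main__":
    teff = teff_from_vk(3.0)  # hypothetical late K dwarf color
    print(f"T_e = {teff:.0f} K, M = {mass_from_teff(teff):.2f} M_sun")
```

A color of V - K = 3.0 gives T_e near 4200 K and a mass near 0.65 M_⊙, i.e. close to the middle of the temperature range analyzed here.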

Kepler targets and planets: We use the Quarter 2 Kepler target list from the
Multimission Archive (STScI). Kepler candidate planets are taken from Borucki et al.
(2011), who report R_p based on stellar radius R_*, the orbital period P, and the estimated T_e
and surface gravity log g of the host star. Stellar parameters are based on the multi-passband
photometry and Bayesian analysis of the Kepler Input Catalog (KIC; Brown et al. 2011).
We consider only putative dwarf stars with 4 < log g < 4.9.

Effective temperature range: We choose a T_e range that includes a substantial number
of stars from each sample and maximizes the similarity in the temperature distributions
as assayed by the Kolmogorov-Smirnov (K-S) statistic. For an interval of 1000 K, that
range is 3660-4660 K (K-S probability = 4.7 × 10^-3). This includes 150 M2K stars (1202
measurements) and 10,018 Kepler target stars, the latter having 138 candidate planets,
and excludes the 410 very coolest Kepler target stars and 6 hotter M2K stars. The
mean effective temperatures of the M2K and Kepler subsamples are 4230 K and 4200 K,
respectively. The low K-S probability reflects the narrower distribution of M2K stars within
this range of T_e compared to the Kepler sample (Figure 2). We speculate on the possible
impact of this difference on our analysis in Section 6.
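
One way such a window search could be carried out is sketched below. It is not the procedure actually used: the step size, the minimum sample size, and the function names are assumptions, and the K-S probability is computed from the standard asymptotic series rather than from any particular library.

```python
import math
from bisect import bisect_right

def ks_two_sample(a, b):
    """Two-sample K-S statistic D and its asymptotic probability."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the maximum ECDF difference occurs at a data point
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    n_eff = len(a) * len(b) / (len(a) + len(b))
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:  # the series converges poorly here; probability is ~1
        return d, 1.0
    p = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                  for j in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

def best_window(teff_a, teff_b, width=1000.0, step=20.0, min_count=50):
    """Slide a fixed-width T_e window; return (lower edge, K-S probability)
    of the window in which the two samples are most similar."""
    lo = min(min(teff_a), min(teff_b))
    hi_end = max(max(teff_a), max(teff_b))
    best = None
    while lo + width <= hi_end:
        sub_a = [t for t in teff_a if lo <= t <= lo + width]
        sub_b = [t for t in teff_b if lo <= t <= lo + width]
        if len(sub_a) >= min_count and len(sub_b) >= min_count:
            _, p = ks_two_sample(sub_a, sub_b)
            if best is None or p > best[1]:
                best = (lo, p)
        lo += step
    return best
```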



3. Model 

Planet frequency: The expected frequency of the ith planet candidate in the Kepler
survey is 1/s_i, where s_i = Σ_j p_ij q_ij, p_ij is the geometric probability of a transiting orbit
around the jth star, and q_ij is the probability of detection if the planet is on a transiting
orbit. s_i is the expected number of stars around which a planet would be detected, if every
star had this planet on its particular orbit. For example, a planet that could have been
detected around 100 stars, but has been found once, has a most likely occurrence rate of 1%.
For planets that are small compared to their host stars and on nearly circular orbits, the
transit probability is:

    p = 0.238\, F\, P^{-2/3} M_*^{-1/3} R_*,    (2)

where P is in days and F = T/P if P > T, where T is the observation period (120 d),
or else F = 1. M_* and R_* are in solar units. A planet is detected if SNR = δ/σ > 7
(Borucki et al. 2011), where δ is the transit depth and σ is the noise over the entire transit.
In our Monte Carlo calculations (see below) q_ij only takes on values of 0 or 1 depending on
whether SNR > 7. Assuming uncorrelated noise,

    \mathrm{SNR} = \frac{\delta}{\sigma_{30}} \sqrt{\frac{N \Delta}{30}},    (3)

where σ_30 is the noise per 30-minute integration, N is the number of observed transits, and
Δ is the transit duration in minutes. The transit depth is δ ≈ 8.4 × 10^-5 (R_p/R_*)^2, where
R_p is in Earth units. The noise per 30-min integration as a function of Kepler magnitude
K_p is σ_30 ≈ 10^[(K_p - 13)/5 - 4] (Koch et al. 2010). We multiply this by a factor drawn randomly
from the distribution in Figure 4 of Koch et al. (2010) to account for stellar variability.
Stars with factors > 10 are assigned a factor of 10. The number of observed transits is the
largest integer less than T/P. (Three cases where P > T and N = 1 were confirmed by the
Kepler team using later observations.) The transit duration for a circular orbit, averaged
over all possible impact parameters, is

    \Delta \approx 85\, R_* M_*^{-1/3} P^{1/3}\ \mathrm{min}.    (4)



To account for incompleteness or overestimation of the detection efficiency of Kepler, we
multiply s_i by a constant parameter C, where 0 < C < 1. We use a single, uniform value
for detection efficiency both as a necessary simplification and because it can describe one
possible cause of detection inefficiency: the presence of giant stars in the target list (Section
4). We do not correct for false positives, probably 5-10% (Morton & Johnson 2011). C > 1
is possible but unlikely if the false-positive rate is low, and we do not consider values of C
< 0.2.
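
The detection model of Equations (2)-(4) can be sketched as below. This is an illustration under stated assumptions rather than the code used for the analysis: the function names are hypothetical, the stellar variability factor is passed in as a plain number instead of being drawn from the Koch et al. distribution, and at least one transit is always assumed.

```python
import math

T_OBS = 120.0  # observation period in days, as quoted in the text

def transit_probability(period, mstar, rstar):
    """Equation (2): geometric transit probability including the factor F."""
    f = T_OBS / period if period > T_OBS else 1.0
    return 0.238 * f * period ** (-2.0 / 3.0) * mstar ** (-1.0 / 3.0) * rstar

def transit_snr(rp_earth, period, mstar, rstar, kp, variability=1.0):
    """Equation (3) with the depth, noise, and duration relations of the text."""
    depth = 8.4e-5 * (rp_earth / rstar) ** 2
    sigma30 = variability * 10.0 ** ((kp - 13.0) / 5.0 - 4.0)
    duration = 85.0 * rstar * mstar ** (-1.0 / 3.0) * period ** (1.0 / 3.0)  # min
    # Largest integer less than T/P; the floor of one transit is an assumption.
    n_transits = max(1, math.ceil(T_OBS / period) - 1)
    return depth / sigma30 * math.sqrt(n_transits * duration / 30.0)

def expected_detections(rp_earth, period, stars, completeness=1.0):
    """s_i = C * sum_j p_ij q_ij over stars given as (mass, radius, Kp)."""
    total = 0.0
    for mstar, rstar, kp in stars:
        detected = transit_snr(rp_earth, period, mstar, rstar, kp) > 7.0
        if detected:
            total += transit_probability(period, mstar, rstar)
    return completeness * total
```

The occurrence weight of a candidate is then 1/s_i, so lowering the completeness C raises the inferred frequency of every planet by the same factor.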



Mass-radius relations: For the MRR of planets with R_p > 3 R_⊕, i.e. Neptune size or
larger, we use the masses and radii of 120 confirmed transiting planets (Schneider et al.
2011). M_p is calculated using the mean density of the 8 such planets with radii closest
to that of the Kepler object. Smaller planets with radii < 3 R_⊕ are described by a
single population with M_p = R_p^α (Earth units). Although the MRRs of solid planets
(rock/ice/metal) are not expected to precisely follow power laws (Fortney et al. 2007;
Seager et al. 2007), a power law with α ≈ 3.85 is a reasonable approximation for a planet
with an Earth-like ratio of silicates to metal, and little gas. If planets have acquired and
retained a substantial H-He envelope, and the mass fraction of the envelope increases
with M_p, we expect α < 4. For example, M_p ≈ R_p^2 describes a continuum between Earth
and Neptune, and gas-rich super-Earths may have α < 0 (Rogers et al. 2011). Of course,
the small planets may be a mix of both rocky and gas-rich objects and we entertain this
scenario in Section 4.
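
To make the link between the index α and composition concrete: for M_p = R_p^α in Earth units, the mean density relative to Earth is M_p/R_p^3 = R_p^(α-3). The short sketch below evaluates this; the function name is hypothetical.

```python
def relative_density(r_earth, alpha):
    """Mean density in Earth units for M_p = R_p**alpha: rho = M_p / R_p**3."""
    return r_earth ** (alpha - 3.0)

if __name__ == "__main__":
    for alpha in (2.0, 3.85):
        print(alpha, [round(relative_density(r, alpha), 2) for r in (1.0, 2.0, 3.0)])
```

With α = 3.85 the density rises with radius, as expected for compressed rock and metal, while with α = 2 it falls as 1/R_p, as expected for planets with increasingly massive gas envelopes.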

Radial velocity errors: Appropriate modeling of RV errors is crucial to this analysis.
The median standard deviation of formal (including Poisson) errors is 1.3 m s^-1, and
we use the actual formal errors in our calculations. Fifteen pairs of RV measurements
taken within 6 hr show additional total systematic error of ~3 m s^-1. We assume that
additional systematic instrument errors and stellar noise ("jitter") are uncorrelated between
observations and gaussian-distributed, but we examine the effect of correlated instrument
errors in Section 5. For instrument noise we use a fixed RMS of 1.6 m s^-1 based on
observations showing this to be the "basement" level of systematic noise among a large
number of HIRES observations of K stars (Isaacson & Fischer 2010). Stars do not all exhibit
the same level of jitter. Figure 3 shows the distribution of total systematic noise
(instrument plus stellar jitter) predicted for 100 M2K stars based on their Ca II HK
emission, B-V colors, and the equations in Isaacson & Fischer (2010). We adopt a Rayleigh
formula for the distribution of the jitter RMS σ_* among all stars in the sample,

    p(\sigma_*) = \frac{\sigma_*}{\sigma_0^2} \exp\left(-\frac{\sigma_*^2}{2\sigma_0^2}\right),    (5)

where we term σ_0 the magnitude of the jitter. (In Section 5 we also try an exponential
distribution.) The RMS jitter in an ensemble of stars with a Rayleigh distribution is
√2 σ_0. Our 6 hr systematic noise level of 3 m s^-1 can be explained if σ_0 = 1.8 m s^-1. The
predicted jitter distribution is best described by σ_0 = 1.7 ± 0.1 m s^-1 (Figure 3), consistent
with our observations of 3 m s^-1 total RMS. Additional noise due to stellar rotation and
starspots may occur on longer timescales (Barnes et al. 2011), and we perform calculations
with σ_0 over the range 1.5-4.5 m s^-1. However, we consider values near the upper limit,
corresponding to an average systematic noise of 6.5 m s^-1, highly implausible because of
the absence of active stars in our sample (inset of Figure 3). This is discussed further in
Section 6.
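
The total systematic noise values quoted alongside σ_0 are reproduced if the fixed 1.6 m s^-1 instrument noise and the ensemble jitter RMS √2 σ_0 are added in quadrature. That combination is an inference from the quoted numbers, not a formula stated in the text. The sketch below checks it and draws per-star jitter values from the Rayleigh distribution by inverting its cumulative distribution; the function names are hypothetical.

```python
import math
import random

SIGMA_INSTRUMENT = 1.6  # m/s, fixed instrument noise quoted in the text

def total_systematic_rms(sigma0, sigma_inst=SIGMA_INSTRUMENT):
    """Ensemble RMS of instrument noise plus Rayleigh-distributed jitter.

    Assumes the two add in quadrature and that the jitter RMS is sqrt(2)*sigma0.
    """
    return math.sqrt(sigma_inst ** 2 + 2.0 * sigma0 ** 2)

def draw_jitter(sigma0, rng=random):
    """Draw one star's jitter RMS from a Rayleigh distribution (scale sigma0)."""
    u = rng.random()  # uniform in [0, 1)
    return sigma0 * math.sqrt(-2.0 * math.log(1.0 - u))

if __name__ == "__main__":
    for sigma0 in (1.8, 2.4, 3.2, 4.5):
        print(sigma0, round(total_systematic_rms(sigma0), 1))
```

Under this assumption σ_0 = 1.8 m s^-1 corresponds to about 3.0 m s^-1 of total systematic noise and σ_0 = 4.5 m s^-1 to about 6.5 m s^-1, matching the values given above.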

Radial velocity calculations: We predict the distribution of RV RMS for each set
of parameter values by generating 10,000 Monte Carlo systems, with host stars selected
with replacement from the M2K survey, and orbital inclinations drawn from an isotropic
distribution. Each Kepler candidate planet has a probability 1/s_i of being added to each
star. This ignores any autocorrelation between the presence of planets. Masses are assigned
to each planet using the Kepler radius and the MRR. We ignore all planet candidates with
radii larger than the largest confirmed transiting planet (~2 Jupiter radii) as main sequence
companions or false positives. The RV variation induced by each planet is calculated
from the planet mass, host star mass, and system inclination. Orbits are assumed to
be approximately coplanar (Lissauer et al. 2011). Radial velocities are calculated using
the actual epochs of observations and random mean anomalies at the first epoch. We
draw longitudes of periastron from a uniform distribution and orbital eccentricities from
a Rayleigh distribution with mean of 0.225 (Moorhead et al. 2011). We add formal and
systematic errors to the simulated radial velocities, subtract the error-weighted mean, and
calculate the RMS. To filter binary stars, we remove observed and predicted systems whose
RMS exceeds a specified cutoff B.
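
A stripped-down version of one Monte Carlo draw is sketched below. It is a simplification, not the procedure described above: orbits are taken to be circular (no eccentricities or longitudes of periastron), the RMS uses inverse-variance weights for the mean, and all names are hypothetical.

```python
import math
import random

def simulate_rms(epochs, formal_errors, mstar, planets, sigma0, rng,
                 sigma_inst=1.6):
    """Return the RV RMS (m/s) of one synthetic star.

    planets: list of (mass_earth, period_days, occurrence), where occurrence
    is the per-star probability 1/s_i. Circular orbits are assumed.
    """
    cos_i = rng.random()                    # isotropic orientation
    sin_i = math.sqrt(1.0 - cos_i ** 2)     # one inclination: coplanar system
    jitter = sigma0 * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
    rv = [0.0] * len(epochs)
    for mass, period, occurrence in planets:
        if rng.random() >= occurrence:
            continue                        # planet absent in this system
        k = (0.0895 * mass * sin_i * (period / 365.25) ** (-1.0 / 3.0)
             * mstar ** (-2.0 / 3.0))       # semi-amplitude, m/s
        phase0 = rng.uniform(0.0, 2.0 * math.pi)
        for n, t in enumerate(epochs):
            rv[n] += k * math.sin(2.0 * math.pi * t / period + phase0)
    noisy = [v + rng.gauss(0.0, math.sqrt(e ** 2 + sigma_inst ** 2 + jitter ** 2))
             for v, e in zip(rv, formal_errors)]
    weights = [1.0 / e ** 2 for e in formal_errors]
    mean = sum(w * v for w, v in zip(weights, noisy)) / sum(weights)
    return math.sqrt(sum((v - mean) ** 2 for v in noisy) / len(noisy))
```

Repeating this draw many times for stars sampled from the Doppler survey, and discarding systems whose RMS exceeds the cutoff B, yields a predicted cumulative distribution of RV RMS for one choice of model parameters.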

Statistical comparison: The model and observed distributions are compared using the
two-sided Kolmogorov-Smirnov (K-S) test and the two-sample Kuiper test; the latter is
sensitive to the tails of a distribution as opposed to the median. The four parameters of the
model are the MRR parameter α, jitter magnitude σ_0, binary cutoff B, and completeness
C. In Section 4 we introduce a fifth parameter γ that describes a mixed population of rocky
and Neptune-like planets.
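
The difference between the two statistics can be seen in a small sketch: the K-S statistic is the largest absolute difference between the two empirical cumulative distributions, while the Kuiper statistic sums the largest positive and largest negative differences. The function name and the toy samples are illustrative only.

```python
from bisect import bisect_right

def ks_and_kuiper(a, b):
    """Two-sample K-S statistic D and Kuiper statistic V = D+ + D-."""
    a, b = sorted(a), sorted(b)
    d_plus = d_minus = 0.0
    for x in a + b:
        diff = bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b)
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return max(d_plus, d_minus), d_plus + d_minus

if __name__ == "__main__":
    # Same median, different spread: the Kuiper statistic is twice D.
    print(ks_and_kuiper([0, 0, 10, 10], [5, 5, 5, 5]))
```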



4. Results 

A byproduct of our analysis is an estimate of the average number of planets per
star, Σ_i s_i^-1. We find that 30% of Kepler stars with T_e = 3660-4660 K have planets with
R_p > 2 R_⊕ and P < 50 d. This is in agreement with the findings of Howard et al. (2011).
The frequency of giant planets (R_p > 0.8 R_J) in our sample is 2.4%, close to that estimated
in Doppler surveys (Johnson et al. 2010). This indicates minimal bias in our Monte Carlo
reconstruction of the discrete Kepler sample because any effect should be most pronounced
for the rarest (largest) planets.

The observed cumulative distribution of RV RMS (points in Figure 4) has an
accelerating rise below 3 m s^-1 from gaussian noise, a logarithmic increase over
3-10 m s^-1 from the combined effect of systematic error and planets not resolved by Doppler
observations, and a tail beyond 10 m s^-1 from giant planets and low-inclination binary
stars. The best-fit models (e.g., solid line) agree with the observed distribution with a K-S
probability >90%. The K-S and Kuiper statistics are largely congruent and hereafter we
show only the former. The 95% confidence intervals for the uncertainty due to the finite size of
the Kepler planet sample were calculated using 200 bootstrap-resampled planet populations
and are plotted as dashed lines in Figure 4. (These illustrate deviations from the best-fit
cumulative distribution, and are not cumulative distributions themselves, which can never
reverse.) The high-RMS tail of the distribution contains few systems and is most poorly
reproduced.

Jitter magnitude, completeness, and MRR parameter α influence the predicted
distribution of RV variation in similar ways, and different combinations of parameter values
can reproduce the observations. Values that produce high K-S probabilities describe a locus
in C-α-σ_0 space. In contrast, our results are insensitive to the binary cutoff B for
reasonable values; B = 110 m s^-1 is used in all analyses. This value excludes 24 systems
and implies that ~70% of M2K stars are single. This is an upper limit because M2K
excludes known spectroscopic and close (< 2 arc-sec) binaries, and is not sensitive to wide
(but unresolved) binaries. This fraction is intermediate between the single star fractions of
40% for G stars and 80% for M stars (Lada 2006).



We first performed calculations assuming C = 1 and allowing α to vary from 2 to 5, and
σ_0 to vary from 1.5 to 4.5 m s^-1. This range of σ_0 is intended to capture the locus of high
K-S probabilities over the entire plausible range of α; high values of σ_0 clearly contravene
our observations and predictions based on chromospheric emission (Figure 3). Contours of
0.01, 0.05, 0.1, and 0.5 probability, corresponding to confidence intervals of 99%, 95%, 90%
and 50%, are plotted in Figure 5a. The location of maximum K-S probability is marked as
an "x", but we caution against overinterpretation of this location because even the contour
of lowest confidence (50%) is very broad.

If the detection efficiency is near unity for these stars, agreement between Kepler and
Doppler observations favors a high α but also demands implausibly high values of σ_0.
Better reconciliation between Kepler and Doppler can be achieved if Kepler detections are
incomplete relative to the idealized calculations for these stars, i.e. if C < 1. If C = 0.5,
then σ_0 ≈ 2 m s^-1 permits values of α ≈ 4, but not much lower values: values of α < 2
are possible only if σ_0 > 3.2 m s^-1 (total systematic noise > 4.8 m s^-1) at 99% confidence
(Figure 5b).



One cause of C < 1 may be interloping giant stars in the target list (Basri et al. 2011).
Giant stars have radii > 10 R_⊙, and the transits of planets even as large as Neptune will be
< 12 ppm and undetectable by Kepler, especially with the confounding effect of oscillations
(Huber et al.). There is indirect evidence for such contamination in the distribution of
planet candidates with stellar colors. Figure 6 plots the g - r (SDSS) and J - K (2MASS)
colors of Kepler target stars from the KIC (Brown et al. 2011). Yellow and red points have
estimated surface gravities 4 < log g < 4.9 (putative main sequence stars) and log g < 4
(putative subgiants and giants), respectively. Black contours are lines of constant (sub)giant
fraction. Purple points mark candidate planet hosts. The green contour encloses 90% of
stars with T_e = 3660-4660 K. Planet-hosting stars are conspicuously sparse in the vicinity of
J - K ≈ 0.7 and g - r ≈ 0.9, where the fraction of (sub)giants exceeds 50%. Many putative
K dwarf stars in this region of color space may instead be misclassified (sub)giants, with
much larger radii and higher variability.

We also evaluated the range of (σ_0, C) parameter space over which the specific scenarios
of rock-metal planets (α = 3.85) and gas/ice-rich planets (α = 2) are allowed (Figures 5c
and d). The former is permitted by a plausible range of σ_0 for C < 1, with C = 0.4-0.5
being most consistent with our Doppler data. All cases with α = 2 are ruled out at >95%
confidence as long as C > 0.2 and σ_0 < 2.4 m s^-1 (total systematic noise < 3.8 m s^-1).



Small planets may instead comprise an admixture of rocky, ice-rich, and gas-rich
worlds. Wolfgang & Laughlin (2011) find evidence for a mixed population around solar-type
stars. We considered this scenario by assuming that the population consists of a mixture of
α = 3.85 and α = 2 planets with frequency γ and 1 - γ, respectively. The K-S probability
distribution vs. σ_0 and γ is plotted in Figure 7. The maximum K-S probability (93%)
occurs for γ = 0.88 and σ_0 = 2.8 m s^-1, but a range of correlated γ and σ_0 values are
possible. If σ_0 is not much larger than 2 m s^-1 then values of γ near unity are clearly
favored, and if σ_0 < 2.6 m s^-1 (<4 m s^-1 total systematic Doppler error) then γ > 0.5 at
99% confidence.



5. Sensitivity to Model Assumptions 

We performed a series of calculations to test the sensitivity of our results to some
assumptions of the model. We considered the C = 0.5 case, and thus outcomes should be
compared to Figure 5b.

Distribution of jitter RMS: We replaced the Rayleigh distribution of jitter RMS with
an exponential distribution, while maintaining the same ensemble RMS. This modification
shifts the locus of acceptable models to slightly lower values of α and slightly higher
values of σ_0 (Figure 8), but otherwise does not significantly impact our results.

Correlated noise: Correlated or "red" instrument noise in Doppler observations does
not decrease as the square root of the number of measurements, making de novo detections
of signals comparable to such noise very difficult. This has little impact on our results
because they rely on Kepler for planet detections and we analyze only the variance (total
power) of the RV, a quantity independent of the noise spectrum. Correlated noise would
only be important if there was significant drift of HIRES measurements on timescales
longer than the timespan of our measurements (months to years). Long-term monitoring
of RV-stable stars rules out such behavior, e.g. Apps et al. (2010). We further tested the
possible effect of red noise on our analysis by modeling instrument noise as correlated with
a power spectrum exp(-ωτ), where τ is the noise coherence time. Uncorrelated ("white")
noise values w_i at times t_i are replaced by "red" noise values r_i where

    r_i = \sum_j \frac{w_j}{1 + \left[2(t_j - t_i)/\tau\right]^2},    (6)

and the sum is over all observations, whether they are of a given star, or not. In calculating
reddened instrumental noise, we use the actual epochs of the observations t_j. Errors are
then re-normalized to keep the variance the same. The coherence time of HIRES instrument
noise is not known but we assume τ = 20 d. Figure 8b shows that the impact on our results
is very small.

Random errors in KIC radii: Inferences about densities and a mass-radius relationship 
depend sensitively on Kepler's estimates of planet radii, which are uncertain. To investigate 
the effect of random errors, we added gaussian-distributed errors with 25% RMS to the 
Kepler radii. This modification broadens the locus of acceptable parameter values and 
shifts the best-fit models to slightly lower a and slightly higher jitter, but otherwise does 
not significantly impact our results (Figure 8).

Systematic errors in KIC radii: The asteroseismically determined radii of many
Kepler solar-type stars are systematically larger (a median of 20%) than KIC estimates
(Verner et al. 2011). If this were also the case for the late K and early M stars in our sample,
the planets they host would be larger by the same amount, and hence less dense. If the
effect is uniform, the inferred frequency of planets, which depends mostly on detectability,
transit depth and hence the ratio of radii, is largely unchanged. We investigated this
scenario by increasing the radii of all stars and planets by 20% (Figure 8). Larger planet
radii and lower densities shift the locus of permissible α and σ_0 to only slightly lower
values. On the other hand, Muirhead et al. (2011) point out that a stellar evolution model
predicts consistently smaller radii for planet-hosting M dwarfs compared to KIC estimates.
A running median of KIC radii vs. effective temperature, reduced by 15%, is roughly
consistent with a Yale-Yonsei 5 Gyr solar-metallicity isochrone. We therefore performed a
second analysis in which star and planet radii were uniformly decreased by 15% (Figure
8). As expected, this shifts the locus to both higher α and σ_0. As we discuss below,
systematic overestimation of stellar radius and the presence of interloping giant stars may
not necessarily be incompatible.

6. Discussion



Our combined analysis of Kepler transit detections and Doppler radial velocities for
late K and early M stars finds that consistency is possible for a wide but not unlimited
range of parameters. As expected for an analysis based on RV variance, there is an inverse
relationship between acceptable values of planet mass, i.e., the power-law index α of the
planet mass-radius relation, and stellar jitter, i.e. the parameter σ_0 that characterizes its
distribution among stars. However, if the level of radial velocity jitter in M2K stars is as
expected, reconciliation of Kepler and Doppler observations can only be achieved if α ≈ 4,
and α ≈ 2 is excluded. In other words, small planets around these stars are primarily
rocky-metal "super-Earths" rather than hydrogen gas-rich "mini-Neptunes". We cannot
absolutely rule out higher jitter (σ_0 > 3 m s^-1, corresponding to total systematic RMS
> 4.5 m s^-1) that would admit a lower value of α, but there is no evidence to support such a
choice. Instead, σ_0 ≈ 2 m s^-1 is supported by the RMS of our paired Doppler observations,
the predicted stellar jitter based on chromospheric activity, and the observed levels of jitter
among other, similar stars (Apps et al. 2010; Isaacson & Fischer 2010). Our choice of α = 2
to represent gas-rich planets is conservative because theoretical modeling suggests values
closer to zero or even negative over the mass range of interest (Rogers et al. 2011).



Reconciliation of Kepler and Doppler data, even with α ≈ 4, also appears to require
that Kepler's detection efficiency be less than unity and perhaps ~50%. Some of this
incompleteness could arise if many target stars are misclassified subgiant or giant stars
around which Neptune-size or smaller planets are difficult or impossible to detect by Kepler.
Spectroscopic follow-up finds that essentially all late K and M Kepler stars brighter than
K_p = 14 are giants (Mann et al., in prep.); we estimate the rate of interlopers in our
sample of Kepler targets to be at least 15%. Giant interlopers are rare among the transiting
planet-hosting Kepler stars (Muirhead et al. 2011) because the vast majority of planets
are smaller than Jupiter and not detectable around giant stars. Additional incompleteness
could come from higher stellar variability.

Our analysis appears robust to the precise choice of function for the distribution
of jitter RMS among stars, as long as the overall noise variance is conserved. It is also
insensitive to the presence of correlated or "red" noise in the Doppler RV data. Although
our results are not overly susceptible to random errors in estimated stellar radii, they
do vary with uniform systematic errors in those values. If radii have been uniformly
overestimated, as comparisons with stellar evolution models suggest, agreement between
Kepler and M2K statistics favors a slightly higher value of α, reinforcing our conclusion that
the small planets around these stars are primarily rocky. Although the resulting offset of
the locus with α may seem small, one property of a power-law MRR is that a compensatory
fractional change in index α will equal the fractional magnitude of a systematic change in
radius, modulo a logarithmic factor which is approximately unity. For example, if radii are
15% smaller then α should be 4.6 instead of 4.
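
The compensation rule can be checked with a short calculation. The sketch below assumes the
mass-radius relation has the form M/M_⊕ = (R/R_⊕)^α and asks which index returns the same
mass when every radius is scaled by (1 - f). The function names and the reference radius
R = e R_⊕ (where the logarithmic factor is exactly one) are illustrative choices, not part of
the analysis itself.

```python
import math

def compensating_index_exact(alpha, f, radius_earth):
    """Index a' such that (R*(1-f))**a' == R**alpha, with R in Earth radii.

    Only meaningful for R*(1-f) > 1, where both logarithms are positive.
    """
    log_r = math.log(radius_earth)
    return alpha * log_r / (log_r + math.log(1.0 - f))

def compensating_index_first_order(alpha, f, radius_earth):
    """First-order form: fractional change in index ~= f / ln(R)."""
    return alpha * (1.0 + f / math.log(radius_earth))

# Radii 15% smaller, starting index 4, planet radius e ~ 2.7 Earth radii
alpha_lin = compensating_index_first_order(4.0, 0.15, math.e)
alpha_exact = compensating_index_exact(4.0, 0.15, math.e)
print(f"first order: {alpha_lin:.2f}, exact: {alpha_exact:.2f}")
```

To first order this reproduces the shift from 4 to 4.6; the exact expression gives a
somewhat larger index for the same radius, and the size of the shift depends on planet radius
through the logarithmic factor.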

Systematic underestimation of stellar radii can be reconciled with the presence
of interloping giant stars by accounting for strong selection effects among stars with
transit-detected planets: Just as transit surveys of a given set of stars are biased towards the
largest planets (Gaudi 2005), a given set of planets will be more readily detected by transit
around the smallest stars in a sample; stars with detected planets are thus not necessarily
representative of the entire sample. Reliable estimates of the radii of a representative sample
of late-type Kepler target stars should be vigorously pursued.

Our analysis is predicated on statistically indistinguishable planet populations in our
samples of stars from the Kepler field and solar neighborhood. A plausible condition for
this assumption is that the two samples have similar mass and metallicity distributions
and are drawn from the same stellar population. The effective temperature distributions
are similar, but not identical (Figure 2), and this may translate into differences in stellar
mass. There is an excess of about 30 M2K stars (~20% of the sample) around 4200 K and
a deficit around 3850 K. According to a Yale-Yonsei 5 Gyr solar-metallicity isochrone, this
350 K increase corresponds to changing the stellar mass from 0.57 M_⊙ to 0.65 M_⊙. Adopting
the relation between stellar mass and giant planet frequency of Johnson et al. (2010) at face
value, these M2K stars would have a 14% higher incidence of giant planets, but the giant
planet frequency in the overall sample would only be 3% higher. According to Equation 9
in Howard et al. (2011) the frequency of all planets would decrease by 6%.
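
The 14% and 3% figures follow from simple arithmetic. The sketch below assumes that giant
planet frequency scales linearly with stellar mass, which reproduces the 14% quoted above;
the linear exponent and the variable names are assumptions made for illustration rather than
a restatement of the cited relation.

```python
# Stellar masses (solar units) from the isochrone at ~3850 K and ~4200 K
mass_cool, mass_warm = 0.57, 0.65
excess_fraction = 0.20  # ~30 of 150 M2K stars lie in the excess near 4200 K

# Assumed scaling: giant planet frequency proportional to stellar mass
mass_exponent = 1.0
boost_for_excess_stars = (mass_warm / mass_cool) ** mass_exponent - 1.0

# Only the excess stars are affected, so the sample-wide change is diluted
boost_for_sample = excess_fraction * boost_for_excess_stars

print(f"excess stars: +{boost_for_excess_stars:.0%}, whole sample: +{boost_for_sample:.0%}")
```

The dilution by the excess fraction is why a noticeable change for the affected stars
becomes a change of only a few percent for the sample as a whole.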



M2K stars are all within 45 pc of the Sun, and the median distance is 25 pc,
placing them well within the galactic disk. These stars are drawn from a proper
motion-selected catalog (>40 mas yr^-1) with a transverse velocity limit of 8.6 km s^-1 at
45 pc (Lepine & Shara 2005). The velocity dispersion of stars in the solar neighborhood is
anisotropic but a rough estimate of 80% completeness at 45 pc is obtained by assuming an
isotropic distribution with a dispersion of 25 km s^-1 (Bond et al. 2010). The correlation
between metallicity and velocity dispersion (via age) means that this sample will be
biased against metal-rich stars, but this effect is very small: Stars with [Fe/H] = -0.5 (more
metal-poor stars are very uncommon) have a velocity dispersion ~5 km s^-1 higher than
their solar metallicity counterparts (Lee et al. 2011), and the corresponding completeness is
~84% at 45 pc. The bias against solar-metallicity stars in M2K is therefore <5%. Although
we excluded the most active stars from the analysis and may have removed any very young
stars, this should not affect the metallicity distribution because the metallicity-age relation
is flat in this range (Holmberg et al. 2007). Tidal decay of the orbits of low mass planets
around small stars is expected to be extremely slow (Jackson et al. 2009) and would not
appreciably evolve a planet population.
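
The velocity limit quoted above follows from the standard conversion between proper motion
and transverse velocity, v_t = 4.74 μ d, with μ in arcsec yr^-1 and d in pc. A minimal
sketch (the function and variable names are illustrative):

```python
KM_S_PER_AU_PER_YR = 4.74047  # 1 AU per year expressed in km/s

def transverse_velocity_km_s(proper_motion_mas_yr, distance_pc):
    """Transverse velocity implied by a proper motion at a given distance."""
    proper_motion_arcsec_yr = proper_motion_mas_yr / 1000.0
    return KM_S_PER_AU_PER_YR * proper_motion_arcsec_yr * distance_pc

v_limit = transverse_velocity_km_s(40.0, 45.0)   # catalog limit at the survey edge
v_median = transverse_velocity_km_s(40.0, 25.0)  # same limit at the median distance
print(f"{v_limit:.2f} km/s at 45 pc, {v_median:.2f} km/s at 25 pc")
```

This gives about 8.5 km s^-1 at 45 pc, close to the value quoted, and a lower threshold
for nearer stars, so any kinematic incompleteness is greatest at the outer edge of the survey.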



The kinematics and metallicities of Kepler field stars have yet to be established. The
center of the Kepler field (l = 77°, b = +13°) is nearly perpendicular to the direction
to the galactic center, and approximately parallel to the galactic plane. We estimate
photometric distances from Kepler-derived temperatures using the empirical relation for
absolute magnitude M_J ≈ 6.25 - 16.53 log(T_e/4000). At the median estimated distance
of the subsample (256 pc), a star at the center of the Kepler field has approximately the
same galactocentric distance as the solar neighborhood and is only ~60 pc above the
galactic plane; most stars should belong to the thin disk and have near-solar metallicities.
Consistent with this, the TRILEGAL stellar population model (Vanhollebeke et al. 2009)
predicts that only 7% of stars in this range of effective temperature and magnitude belong
to the thick disk or halo, and only 6% have [Fe/H] < -0.5. Thus we conclude that the
Kepler and M2K samples are very similar in mass, metallicity, and age.
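
The distance and height estimates can be sketched as follows. The code assumes a base-10
logarithm in the M_J relation, an apparent J-band magnitude as input, no interstellar
extinction, and a Sun lying in the galactic plane; the function names are illustrative.

```python
import math

def absolute_j_magnitude(t_eff):
    """Empirical relation from the text: M_J ~ 6.25 - 16.53 log10(T_e / 4000)."""
    return 6.25 - 16.53 * math.log10(t_eff / 4000.0)

def photometric_distance_pc(apparent_j, t_eff):
    """Distance from the distance modulus, ignoring extinction (an assumption)."""
    return 10.0 ** ((apparent_j - absolute_j_magnitude(t_eff) + 5.0) / 5.0)

def height_above_plane_pc(distance_pc, galactic_latitude_deg):
    """Height relative to the Sun; the Sun's own offset from the plane is neglected."""
    return distance_pc * math.sin(math.radians(galactic_latitude_deg))

# Median subsample distance, seen toward the field center at b = +13 degrees
z_median = height_above_plane_pc(256.0, 13.0)
print(f"height above the plane: {z_median:.0f} pc")
```

At 256 pc and b = +13° the height is a little under 60 pc, consistent with the figure given
above.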



Howard et al. (2011) estimated planet densities by comparing distributions of Kepler
radii with masses from a Doppler survey of solar-type stars. They inferred a higher
density for the smallest planets, consistent with our findings. Wolfgang & Laughlin (2011),
using a different set of Doppler-detected planets, also concluded that the majority of
small planets around solar-type stars are rocky. They also found that the proportion of
low density, gas-rich planets increases with planet size, a feature essentially intrinsic to
our analysis because of our choice of α = 2. Theoretical models predict the formation
of inner, rocky planets (Raymond et al. 2004), and the stellar UV-driven escape of any
primordial hydrogen atmospheres (Pierrehumbert & Gaidos 2011). The low density of the
short-period super-Earth GJ 1214b (Charbonneau et al. 2009) can be explained by a
substantial hydrogen envelope (Croll et al. 2011), but also by a thick H2O shell (Bean et al.
2010a; Desert et al. 2011). Its host is a cooler (3000 K), much less luminous mid-M star and
this may permit retention of hydrogen (Pierrehumbert & Gaidos 2011). Gas- and ice-rich
planets resembling GJ 1214b may be the exception rather than the rule around the coolest
Kepler target stars. Refinement of Doppler systematic errors and the properties of Kepler
target stars, specifically the radii of K and M dwarfs and the fraction of interloping giants,
will permit more robust constraints.

This research was supported by NSF grants AST-09-08406 (EG), AST-10-36283 
(DAF), AST-09-08419 (SL); NASA grants NNX10AI90G and NNX11AC33G (EG), and the 
NASA KPDA program (DAF). The Kepler mission is funded by the NASA Science Mission 
Directorate, and data were obtained from the Multimission Archive at the Space Telescope 
Science Institute, funded by NASA grant NNX09AF08G. We thank Andrew Howard for his 
help with screening Doppler template spectra. 




REFERENCES 

Apps, K., et al. 2010, PASP, 122, 156 

Barnes, J. R., Jeffers, S. V., & Jones, H. R. A. 2011, MNRAS, 412, 1599 
Basri, G., et al. 2011, AJ, 141, 20 
Batalha, N. M., et al. 2010, ApJ, 713, L109 
Batalha, N., et al. 2011, ApJ, 729, 27 

Bean, J. L., Seifahrt, A., Hartman, H., Nilsson, H., Wiedemann, G., Reiners, A., Dreizler, 
S., & Henry, T. J. 2010, ApJ, 713, 410 

Bean, J. L., Kempton, E. M.-R., & Homeier, D. 2010a, Nature, 468, 669 

Benedetto, G. D. 1998, A&A, 339, 858

Bond, N. A., et al. 2010, ApJ, 716, 1

Borucki, W. J., et al. 2011, ApJ, 736, 19 

Brown, T., Latham, D., Everett, M., & Esquerdo, G. 2011, AJ, 142, 112 

Butler, R. P., Marcy, G. W., Williams, E., McCarthy, C., Dosanjh, P., & Vogt, S. S. 1996,
PASP, 108, 500

Chabrier, G., & Baraffe, I. 2000, ARA&A, 38, 337 

Charbonneau, D., et al. 2009, Nature, 462, 891 

Croll, B., Albert, L., Jayawardhana, R., Miller-Ricci Kempton, E., Fortney, J. J., Murray,
N., & Neilson, H. 2011, ApJ, 736, 78




Cumming, A., Butler, R. P., Marcy, G. W., Vogt, S. S., Wright, J. T., & Fischer, D. A. 
2008, PASP, 120, 531 

Demarque, P., Woo, J., & Kim, Y. 2004, ApJSS, 155, 667 

Demory, B.-O., et al. 2011, A&A, 533, A114 

Desert, J.-M., Kempton, E. M.-R., Berta, Z. K., Charbonneau, D., Irwin, J., Fortney, J., 
Burke, C. J., & Nutzman, P. 2011, ApJ, 731, L40 

Fortney, J. J., Marley, M. S., & Barnes, J. W. 2007, ApJ, 659, 1661 

Forveille, T., et al. 2011, A&A, 526, A141 

Gaidos, E., Haghighipour, N., Agol, E., Latham, D., Raymond, S. N., & Rayner, J. 2007, 
Science, 318, 210 

Gaudi, B. S. 2005, ApJ, 628, L73 

Gillon, M., et al. 2007, A&A, 472, L13 

Hartman, J., et al. 2011, ApJ, 728, 138

Hatzes, A., et al. 2011, arXiv:1105.3372

Holmberg, J., Nordstrom, B., & Andersen, J. 2007, A&A, 475, 519

Howard, A., et al. 2011, arXiv:1103.2541

Howard, A., et al. 2010, Science, 330, 653 

Huber, D., et al. 2010, ApJ, 723, 1607 

Isaacson, H., & Fischer, D. 2010, ApJ, 725, 875 

Jackson, B., Barnes, R., & Greenberg, R. 2009, ApJ, 698, 1357 




Johnson, J. A., Butler, R. P., Marcy, G. W., Fischer, D. A., Vogt, S. S., Wright, J. T., & 
Kathryn, M. G. 2007, ApJ, 670, 833 

Johnson, J., Aller, K., Howard, A., & Crepp, J. 2010, PASP, 122, 233 

Kasting, J. F., Whitmire, D. P., & Reynolds, R. T. 1993, Icarus, 101, 108 

Kennedy, G. M., & Kenyon, S. J. 2008, ApJ, 682, 1264 

Koch, D., et al. 2010, ApJ, 713, L79 

Lada, C. J. 2006, ApJ, 640, L63 

Laughlin, G., Bodenheimer, P., & Adams, F. C. 2004, ApJ, 612, L73 

Lee, Y., Beers, T., An, D., Ivezic, Z., & Just, A. 2011, ApJ, 738, 187 

Lepine, S., & Gaidos, E. 2011, AJ, 142, 138 

Lepine, S., & Shara, M. M. 2005, AJ, 129, 1483 

Lissauer, J., et al. 2011, arXiv:1102.0543

Mamajek, E., & Hillenbrand, L. 2008, ApJ, 687, 1264 

Marcy, G., & Butler, R. 1992, PASP, 104, 270 

Mayor, M., et al. 2009, A&A, 507, 487 

Moorhead, A., et al. 2011, ApJSS, 197, 1 

Morton, T., & Johnson, J. 2011, ApJ, 738, 170 

Muirhead, P., Hamren, K., Schlawin, E., Rojas-Ayala, B., Covey, K., & Lloyd, J. 2011,
arXiv:1109.1819



Pierrehumbert, R., & Gaidos, E. 2011, ApJ, 734, L13 

Raymond, S. N., Quinn, T., & Lunine, J. I. 2004, Icarus, 168, 1

Rogers, L. A., Bodenheimer, P., Lissauer, J. J., & Seager, S. 2011, ApJ, 738, 59 

Schneider, J., Dedieu, C., Le Sidaner, P., Savalle, R., & Zolotukhin, I. 2011, A&A, 532,
A79

Seager, S., Kuchner, M., Hier-Majumder, C. A., & Militzer, B. 2007, ApJ, 669, 1279 
Valenti, J., & Piskunov, N. 1996, A&ASS, 118, 595 

Vanhollebeke, E., Groenewegen, M. A. T., & Girardi, L. 2009, A&A, 498, 95 
Verner, G. A., et al. 2011, ApJ, 738, L28 

Vogt, S., et al. 1994, in Proc. SPIE Instrumentation in Astronomy VIII, ed. D. L. Crawford 
& E. R. Craine (SPIE), 362 

Winn, J., et al. 2011, ApJ, 737, L18 

Wolfgang, A., & Laughlin, G. 2011, arXiv:1108.5842

Zechmeister, M., Kürster, M., & Endl, M. 2009, A&A, 505, 859



This manuscript was prepared with the AAS LaTeX macros v5.2.







Fig. 1. - Kepler candidates plotted by orbital period and planet radius R_p, and contours
of constant radial velocity variation (RMS = 2, 6, 20, 100 m s^-1) assuming a circular orbit,
orbital inclination of 60°, and a mass given either by an average of confirmed transiting
planets with similar radius (if R_p > 3 R_⊕) or proportional to radius squared (if R_p < 3 R_⊕).
If the small planets are rocky then mass will be higher (proportional to a steeper power of
R_p, α ≈ 4 rather than 2) and the RV RMS contours will be lower. Although many Kepler
planets would be very difficult to individually detect (RMS < 6 m s^-1), they will aggregately
contribute to significant RV variation, especially if they are composed primarily of rock and
metal.
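
The contour calculation described in this caption can be sketched in a few lines. The code
assumes a mass-radius relation normalized to one Earth mass at one Earth radius, a planet mass
much smaller than the stellar mass, and a default stellar mass of 0.6 M_⊙; that default and
the function names are hypothetical choices for illustration.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
M_EARTH = 5.972e24   # kg
DAY = 86400.0        # s

def planet_mass_earth(radius_earth, alpha):
    """Power-law mass-radius relation M/M_E = (R/R_E)**alpha (assumed normalization)."""
    return radius_earth ** alpha

def rv_rms_m_s(radius_earth, period_days, alpha, star_mass_sun=0.6, incl_deg=60.0):
    """RMS stellar radial velocity for a circular orbit, planet mass << star mass."""
    m_p = planet_mass_earth(radius_earth, alpha) * M_EARTH
    m_s = star_mass_sun * M_SUN
    period = period_days * DAY
    sin_i = math.sin(math.radians(incl_deg))
    k = (2.0 * math.pi * G / period) ** (1.0 / 3.0) * m_p * sin_i / m_s ** (2.0 / 3.0)
    return k / math.sqrt(2.0)  # RMS of a sinusoid is amplitude / sqrt(2)

# A 2 Earth-radius planet on a 10 d orbit: gas-rich (alpha = 2) vs. rocky (alpha = 3.85)
rms_gas = rv_rms_m_s(2.0, 10.0, 2.0)
rms_rock = rv_rms_m_s(2.0, 10.0, 3.85)
print(f"alpha = 2: {rms_gas:.1f} m/s, alpha = 3.85: {rms_rock:.1f} m/s")
```

For these illustrative inputs the rocky case gives an RV signal several times larger than
the gas-rich case (roughly 3.7 versus 1.0 m s^-1), which is why the aggregate Doppler
variation is sensitive to the index α.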
Fig. 2. - Distribution of effective temperatures in the M2K Doppler survey (bars) and Kepler
Quarter 2 target catalog (dotted line) in the range 3660-4660 K. This range was chosen to
maximize the similarity between the distributions as measured by the Kolmogorov-Smirnov
test.
Fig. 3. - Predicted systematic error (instrument and stellar jitter) of Doppler measurements
for 100 M2K stars based on B - V color and emission in the H and K lines of Ca II. The
dashed line is a best-fit noise model with a uniform instrument error of 1.6 m s^-1 RMS added
in quadrature to Rayleigh-distributed stellar jitter with σ_0 = 1.7 m s^-1. The distribution of
Ca II HK emission, parameterized by the R'_HK index, is plotted in the inset. The median
R'_HK of the sample is -4.70, corresponding to an age of about 3 Gyr (Mamajek & Hillenbrand
2008).
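
The noise model in this caption can be illustrated with a small Monte Carlo. The sketch
assumes σ_0 is the Rayleigh scale parameter, so that the mean square jitter is 2σ_0^2; the
function name, the seed, and the number of draws are arbitrary illustrative choices.

```python
import math
import random

def total_systematic_error(instrument_rms, sigma0, rng):
    """One draw: fixed instrument error plus Rayleigh stellar jitter, in quadrature."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    jitter = sigma0 * math.sqrt(-2.0 * math.log(u))  # inverse-CDF Rayleigh sample
    return math.hypot(instrument_rms, jitter)

rng = random.Random(0)
draws = [total_systematic_error(1.6, 1.7, rng) for _ in range(100_000)]

# Population RMS, to compare with the analytic sqrt(1.6**2 + 2 * 1.7**2)
rms_total = math.sqrt(sum(d * d for d in draws) / len(draws))
rms_expected = math.sqrt(1.6 ** 2 + 2.0 * 1.7 ** 2)
print(f"simulated {rms_total:.2f} m/s, analytic {rms_expected:.2f} m/s")
```

Under the same assumption, σ_0 = 3 m s^-1 gives a total of about 4.5 m s^-1, which matches
the correspondence between jitter and total systematic RMS stated in the discussion.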
Fig. 4. - Cumulative distribution of RV RMS (points) in M2K stars with T_e = 3660-4660 K.
The solid line is a model based on Kepler radii with Rayleigh-distributed systematic noise
(σ_0 = 2.6 m s^-1), Kepler detection efficiency factor C = 0.5, binary cut-off B = 110 m s^-1,
and power-law mass-radius relation with index α = 3.85 for planets with R_p < 3 R_⊕. The
observations and model have a two-sided Kolmogorov-Smirnov probability of 93% that they
could be drawn from the same population. The dashed lines are 95% confidence intervals
for uncertainties generated by the finite size of the Kepler sample. These intervals are
illustrative; cumulative distributions do not reverse.
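
The comparison in this caption rests on the two-sample Kolmogorov-Smirnov statistic, the
largest vertical gap between two empirical cumulative distributions. The following is a
plain-Python sketch of that statistic only, not the implementation used for the figure;
converting it to a probability additionally requires the K-S distribution, which is omitted
here. The function name is illustrative.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in each sample (handles ties)
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    # Once one sample is exhausted its CDF is 1 and the gap can only shrink
    return d_max

# Toy RV RMS values in m/s, purely illustrative
observed = [1.0, 2.0, 3.0, 4.0]
modeled = [3.0, 4.0, 5.0, 6.0]
d = ks_statistic(observed, modeled)
```

A small statistic corresponds to a high K-S probability that the observed and modeled RMS
values share a parent distribution.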
Fig. 5. - Kolmogorov-Smirnov probability for our Kepler/M2K comparison as it varies
with jitter parameter σ_0, mass-radius relation power law index α, and Kepler detection
efficiency C. Contours (lightest to darkest) are K-S probabilities of 0.01, 0.05, 0.1, and 0.5,
representing confidence intervals of 99%, 95%, 90%, and 50%. The X marks the location of
maximum probability, and the vertical dashed line marks σ_0 = 1.8 m s^-1, the value derived
from observations and expected based on the distribution of Ca II HK emission among M2K
stars. Panel (a) plots K-S probabilities with σ_0 and α assuming a Kepler detection efficiency
of C = 1; (b) same as (a) but with C = 0.5; (c) distribution with σ_0 and C for a rocky planet
MRR (α = 3.85); (d) same as (c) except for a notional gas-rich planet MRR (α = 2).
Fig. 6. - Color-color (SDSS g - r and 2MASS J - K) diagram of Kepler target stars.
Yellow and red points represent stars with estimated log g > 4 (putative dwarfs) and log g <
4 (putative giants or subgiants), respectively. Black contours are of constant (sub)giant
fraction and the green contour encircles 90% of stars with T_e = 3660-4660 K. The large
purple points are the host stars of planet candidates. The main sequence and giant branches
intersect in the region of color-color space occupied by late K stars. This region appears to
be deficient in planet candidates and those present are found where the (sub)giant fraction
is least. This suggests that many putative dwarf stars in this region may be misclassified
subgiants or giants, around which planets would be more difficult or impossible for Kepler
to detect.
Fig. 7. - Kolmogorov-Smirnov probabilities vs. jitter magnitude σ_0 and fraction γ of rocky
(α = 3.85) planets vs. gas-rich (α = 2) planets, assuming a Kepler completeness C = 0.5.
Contours (lightest to darkest) are K-S probabilities of 0.01, 0.05, 0.1, and 0.5, representing
confidence intervals of 99%, 95%, 90%, and 50%. The X marks the location of maximum
probability. The vertical dashed line at 1.8 m s^-1 marks the expected value of σ_0 for M2K
stars.
Fig. 8. - Sensitivity of our results to different assumptions in the model used to translate
Kepler radii into Doppler radial velocity variance. All calculations assume C = 0.5 and
these plots should be compared to Figure 5b. In (a) we use an exponential rather than a
Rayleigh function to describe the distribution of jitter RMS among stars. In (b) instrument
noise is modeled as being correlated on a timescale of 20 d. In (c) 25% Gaussian error is
added to Kepler stellar radius estimates. In (d) the radii of Kepler stars (and planets) are
uniformly increased by 20% or decreased by 15% from KIC values (heavy line). See text for
justification of these choices. For clarity, only the 90% confidence contours are shown in the
last panel.
Table 1. Radial velocity statistics of 150 stars with T_e = 3660-4660 K in the M2K Survey

Star   Measurements   Stand. Dev.   Formal Error
                      (m s^-1)      (m s^-1)

1      18             3.19          1.30
2      8              5.42          1.58
3      9              2.94          1.07

Note. - Table 1 is published in its entirety in the
electronic edition of the Astrophysical Journal.



